Previous studies on unsupervised sentence embeddings have focused on data augmentation methods such as dropout and rule-based sentence transformations. However, these approaches offer limited control over the fine-grained semantics of the augmented views of a sentence. This results in supervision signals that are insufficient for capturing the semantic similarity of similar sentences. In this work, we find that using neighbor sentences enables capturing a more accurate semantic similarity between similar sentences. Based on this finding, we propose RankEncoder, which uses the relations between an input sentence and sentences in a corpus to train an unsupervised sentence encoder. We evaluate RankEncoder from three perspectives: 1) semantic textual similarity performance, 2) efficacy on similar sentence pairs, and 3) the universality of RankEncoder. Experimental results show that RankEncoder achieves 80.07% Spearman's correlation, a 1.1% absolute improvement over the previous state-of-the-art performance. The improvement is even more significant, 1.73%, on similar sentence pairs. In addition, we demonstrate that RankEncoder is universally applicable to existing unsupervised sentence encoders.
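A minimal sketch of the rank-based idea behind RankEncoder: instead of comparing two sentence embeddings directly, compare how each of them ranks the sentences of a reference corpus. The base encoder, corpus, and function names below are hypothetical placeholders; the paper's actual training objective differs in detail.

```python
import numpy as np

def rank_vector(query_emb: np.ndarray, corpus_embs: np.ndarray) -> np.ndarray:
    """Rank every corpus sentence by cosine similarity to the query embedding."""
    sims = corpus_embs @ query_emb / (
        np.linalg.norm(corpus_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    )
    # argsort of argsort turns similarity scores into ranks (0 = least similar)
    return np.argsort(np.argsort(sims)).astype(float)

def rank_similarity(emb_a, emb_b, corpus_embs):
    """Spearman-like similarity: correlation between the two rank vectors."""
    ra, rb = rank_vector(emb_a, corpus_embs), rank_vector(emb_b, corpus_embs)
    ra, rb = ra - ra.mean(), rb - rb.mean()
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb) + 1e-9))

# toy usage with random "sentence embeddings"
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))
a, b = rng.normal(size=64), rng.normal(size=64)
print(rank_similarity(a, b, corpus))
```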
Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single node representation, even though a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node in the graph, and every node can have a different number of assigned persona embeddings. The framework is flexible, and its general design supports wide applicability of the learned embeddings across domains. We utilize publicly available benchmark datasets to evaluate our approach against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE for a variety of important tasks, including link prediction, where we achieve an average gain of 15% while remaining competitive for node classification. Finally, we also demonstrate the utility of PersonaSAGE with a case study on personalized recommendation of different entity types in a data management platform.
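A minimal, hypothetical sketch of persona-based neighborhood aggregation in the spirit of PersonaSAGE: a node keeps K persona embeddings, and each neighbor message is softly routed to the persona it is most compatible with before pooling. The real model learns the personas, and the number of personas per node, end to end.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def persona_aggregate(personas: np.ndarray, neighbor_feats: np.ndarray) -> np.ndarray:
    """personas: (K, d) persona embeddings of the target node.
    neighbor_feats: (N, d) feature vectors of its neighbors.
    Returns updated (K, d) persona embeddings."""
    # compatibility of each neighbor with each persona, softly normalized per neighbor
    assign = softmax(neighbor_feats @ personas.T, axis=1)      # (N, K)
    # persona-wise weighted mean of neighbor messages
    weighted = assign.T @ neighbor_feats                       # (K, d)
    weights = assign.sum(axis=0, keepdims=True).T + 1e-9       # (K, 1)
    return 0.5 * personas + 0.5 * weighted / weights           # residual mix

rng = np.random.default_rng(0)
personas = rng.normal(size=(3, 16))    # K = 3 personas for one node
neighbors = rng.normal(size=(8, 16))   # 8 neighbor feature vectors
print(persona_aggregate(personas, neighbors).shape)  # (3, 16)
```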
Personalization in Federated Learning (FL) aims to modify a collaboratively trained global model according to each client. Current approaches to personalization in FL are at a coarse granularity, i.e. all the input instances of a client use the same personalized model. This ignores the fact that some instances are more accurately handled by the global model due to better generalizability. To address this challenge, this work proposes Flow, a fine-grained stateless personalized FL approach. Flow creates dynamic personalized models by learning a routing mechanism that determines whether an input instance prefers the local parameters or its global counterpart. Thus, Flow introduces per-instance routing in addition to leveraging per-client personalization to improve accuracies at each client. Further, Flow is stateless which makes it unnecessary for a client to retain its personalized state across FL rounds. This makes Flow practical for large-scale FL settings and friendly to newly joined clients. Evaluations on Stackoverflow, Reddit, and EMNIST datasets demonstrate the superiority in prediction accuracy of Flow over state-of-the-art non-personalized and only per-client personalized approaches to FL.
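A small sketch of the per-instance routing idea in Flow: a learned gate looks at each input instance and decides how much to trust the local (personalized) parameters versus the global ones. The gate form and the linear models below are simplifying assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def routed_prediction(x, w_global, w_local, w_gate):
    """Per-instance mixture of a global and a local linear model."""
    alpha = sigmoid(x @ w_gate)   # scalar in (0, 1): preference for the local model
    return alpha * (x @ w_local) + (1 - alpha) * (x @ w_global), alpha

rng = np.random.default_rng(0)
d = 10
w_global, w_local, w_gate = rng.normal(size=d), rng.normal(size=d), rng.normal(size=d)
for x in rng.normal(size=(3, d)):
    y, alpha = routed_prediction(x, w_global, w_local, w_gate)
    print(f"routing weight to local model: {alpha:.2f}, prediction: {y:.2f}")
```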
We present RecD (Recommendation Deduplication), a suite of end-to-end infrastructure optimizations across the Deep Learning Recommendation Model (DLRM) training pipeline. RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets. Feature duplication arises because DLRM datasets are generated from interactions. While each user session can generate multiple training samples, many features' values do not change across these samples. We demonstrate how RecD exploits this property, end-to-end, across a deployed training pipeline. RecD optimizes data generation pipelines to decrease dataset storage and preprocessing resource demands and to maximize duplication within a training batch. RecD introduces a new tensor format, InverseKeyedJaggedTensors (IKJTs), to deduplicate feature values in each batch. We show how DLRM model architectures can leverage IKJTs to drastically increase training throughput. RecD improves the training and preprocessing throughput and storage efficiency by up to 2.49x, 1.79x, and 3.71x, respectively, in an industry-scale DLRM training system.
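A rough illustration of the deduplication idea behind IKJTs: samples from the same user session repeat many feature values, so a batch can store each unique value once plus an inverse index, and the embedding lookup is performed only on the unique values. The actual IKJT format and its DLRM integration are more involved; the names below are illustrative.

```python
import numpy as np

batch_feature_ids = np.array([7, 7, 7, 42, 42, 7, 13, 42])  # duplicated within a batch

# "Deduplicate": unique values plus inverse keys that reconstruct the original order
unique_ids, inverse_keys = np.unique(batch_feature_ids, return_inverse=True)

embedding_table = np.random.default_rng(0).normal(size=(100, 4))
unique_embs = embedding_table[unique_ids]   # 3 lookups instead of 8
batch_embs = unique_embs[inverse_keys]      # re-expand to per-sample embeddings

print(f"lookups saved: {len(batch_feature_ids) - len(unique_ids)} of {len(batch_feature_ids)}")
print(batch_embs.shape)  # (8, 4)
```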
Despite the huge advancement in knowledge discovery and data mining techniques, the X-ray diffraction (XRD) analysis process has mostly remained untouched and still involves manual investigation, comparison, and verification. Due to the large volume of XRD samples from high-throughput XRD experiments, it has become impossible for domain scientists to process them manually. Recently, they have started leveraging standard clustering techniques to reduce the XRD pattern representations requiring manual effort for labeling and verification. Nevertheless, these standard clustering techniques do not handle problem-specific aspects such as peak shifting, adjacent peaks, background noise, and mixed phases, resulting in incorrect composition-phase diagrams that complicate further steps. Here, we leverage data mining techniques along with domain expertise to handle these issues. In this paper, we introduce an incremental phase mapping approach based on binary peak representations using a new threshold-based fuzzy dissimilarity measure. The proposed approach first applies an incremental phase computation algorithm on discrete binary peak representations of XRD samples, followed by hierarchical clustering or manual merging of similar pure phases to obtain the final composition-phase diagram. We evaluate our method on the composition space of two ternary alloy systems, Co-Ni-Ta and Co-Ti-Ta. Our results are verified by domain scientists and closely resemble the manually computed ground-truth composition-phase diagrams. The proposed approach takes us closer to the goal of complete, end-to-end automated XRD analysis.
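A simplified sketch of a threshold-based fuzzy dissimilarity over binary peak vectors: a peak in one pattern counts as matched if the other pattern has a peak within a small shift window, which tolerates peak shifting. The exact measure and thresholds in the paper may differ; this is only an assumed form.

```python
import numpy as np

def fuzzy_dissimilarity(peaks_a: np.ndarray, peaks_b: np.ndarray, window: int = 2) -> float:
    """peaks_a, peaks_b: binary vectors over discretized 2-theta bins."""
    def matched_fraction(a, b):
        idx_a = np.flatnonzero(a)
        if idx_a.size == 0:
            return 1.0
        idx_b = np.flatnonzero(b)
        if idx_b.size == 0:
            return 0.0
        # a peak is matched if some peak in b lies within +/- window bins
        hits = [np.any(np.abs(idx_b - i) <= window) for i in idx_a]
        return float(np.mean(hits))
    # symmetric: average the two directed matched fractions
    sim = 0.5 * (matched_fraction(peaks_a, peaks_b) + matched_fraction(peaks_b, peaks_a))
    return 1.0 - sim

a = np.zeros(50); a[[5, 20, 33]] = 1
b = np.zeros(50); b[[6, 21, 40]] = 1   # shifted peaks plus one extra
print(fuzzy_dissimilarity(a, b))        # small but non-zero dissimilarity
```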
When one business sells to another business (B2B), the buying business is represented by a group of individuals, called an account, who collectively decide whether to buy. The seller advertises to and interacts with each individual, mostly through digital means. The sales cycle is long, typically spanning several months. Individuals belonging to an account are heterogeneous in how they seek information, so the seller needs to score each individual's interest over a long horizon to decide which individuals must be reached and when. Moreover, the buy decision rests with the account and must be scored to project the likelihood of purchase, a decision that can keep changing up until the actual decision, reflecting group decision-making. We score the decisions of both the account and its individuals in a dynamic manner. Dynamic scoring allows the opportunity to influence different individual members at different time points over the long horizon. The dataset contains behavioral logs of each individual's communication activities with the seller; however, there is no data on the consultations among individuals that lead to the decision. Using neural network architectures, we propose several ways to aggregate information from individual members' activities to predict the group's collective decision. Extensive evaluations show strong model performance.
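A toy sketch of one way to aggregate individual members' activity into an account-level score, in the spirit of the aggregation architectures described above: attention pooling over per-member activity embeddings. The member encoder and attention parameters here are placeholders, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def account_score(member_embs: np.ndarray, attn_vec: np.ndarray, w_out: np.ndarray) -> float:
    """member_embs: (M, d) embeddings summarizing each member's activity log."""
    attn = softmax(member_embs @ attn_vec)      # importance of each member
    account_emb = attn @ member_embs            # weighted account representation
    return float(sigmoid(account_emb @ w_out))  # probability-like buy score

rng = np.random.default_rng(0)
members = rng.normal(size=(5, 32))              # 5 individuals in the account
print(account_score(members, rng.normal(size=32), rng.normal(size=32)))
```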
In the era of data-driven societies, with the ubiquity of Internet of Things (IoT) devices and vast amounts of data stored in different places, distributed learning has gained much traction; however, it assumes independent and identically distributed (IID) data across devices. Relaxing this assumption, which cannot hold in reality anyway due to the heterogeneous nature of devices, Federated Learning (FL) has emerged as a privacy-preserving solution for training a collaborative model over non-IID data distributed across a massive number of devices. However, due to unrestricted participation, the appearance of malicious devices (attackers) intent on corrupting the FL model is inevitable. In this work, we aim to identify such attackers and mitigate their impact on the model, specifically under bidirectional label-flipping attacks with collusion. We propose two graph-theoretic algorithms, based on the minimum spanning tree and the k-densest graph, that leverage correlations among local models. Our FL model eliminates the influence of attackers even when they constitute up to 70% of all clients, whereas prior works cannot afford more than 50% of clients being attackers. The effectiveness of our algorithms is ascertained through experiments on two benchmark datasets, namely MNIST and Fashion-MNIST, with overwhelming numbers of attackers. We establish the superiority of our algorithms over existing ones in terms of accuracy, attack success rate, and early detection round.
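A rough sketch of the minimum-spanning-tree idea: build a dissimilarity graph over the clients' local model updates, cut the heaviest MST edge to split clients into two groups, and treat the group whose internal updates are suspiciously correlated (i.e., colluding) as attackers. The actual algorithms and the decision rule in the paper are more elaborate; this split is only an assumed simplification.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def split_clients(updates: np.ndarray):
    """updates: (n_clients, n_params) flattened local model updates."""
    normed = updates / (np.linalg.norm(updates, axis=1, keepdims=True) + 1e-9)
    dissim = 1.0 - normed @ normed.T                 # cosine dissimilarity matrix
    mst = minimum_spanning_tree(dissim).toarray()
    # remove the heaviest MST edge to obtain two connected components
    i, j = np.unravel_index(np.argmax(mst), mst.shape)
    mst[i, j] = 0.0
    _, labels = connected_components(mst + mst.T, directed=False)
    return labels                                    # group label per client

rng = np.random.default_rng(0)
honest = rng.normal(size=(1, 20)) + 0.6 * rng.normal(size=(6, 20))      # loosely similar
colluding = rng.normal(size=(1, 20)) + 0.05 * rng.normal(size=(4, 20))  # near-identical
print(split_clients(np.vstack([honest, colluding])))                    # two groups of clients
```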
Smart sensing provides an easier and more convenient data-driven mechanism for monitoring and control in the built environment. Data generated in the built environment is privacy-sensitive and limited. Federated learning is an emerging paradigm that enables privacy-preserving collaboration among multiple participants for model training without sharing private and limited data. Noisy labels in the participants' datasets degrade performance and increase the number of communication rounds needed for federated learning to converge. Such large numbers of communication rounds require more time and energy to train the model. In this paper, we propose a federated learning approach to suppress the unequal distribution of noisy labels across participants' datasets. The approach first estimates the noise ratio of each participant's dataset and normalizes the noise ratios using a server dataset. The proposed approach can handle bias in the server dataset and minimize its impact on the participants' datasets. Next, we compute the optimal weighted contribution of each participant using the normalized noise ratio and the influence of each participant. We further derive an expression to estimate the number of communication rounds required for the proposed approach to converge. Finally, experimental results demonstrate the effectiveness of the proposed approach over existing techniques in terms of communication rounds and achieved performance in the built environment.
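A condensed sketch of the weighting idea: estimate each participant's label-noise ratio, normalize it with the (possibly biased) server dataset's noise ratio, and aggregate local models with weights that shrink as the normalized noise grows. The estimation procedure and the exact weighting formula below are simplifications of the approach described above, not its precise definition.

```python
import numpy as np

def aggregation_weights(noise_ratios, server_noise_ratio, sizes):
    noise_ratios = np.asarray(noise_ratios, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    # normalize each participant's noise estimate by the server dataset's estimate
    normalized = noise_ratios / max(server_noise_ratio, 1e-6)
    # cleaner and larger datasets contribute more; renormalize to sum to 1
    raw = sizes * (1.0 - np.clip(normalized, 0.0, 1.0))
    return raw / raw.sum()

def aggregate(local_models, weights):
    return sum(w * m for w, m in zip(weights, local_models))

noise = [0.05, 0.20, 0.40]          # estimated noise ratio per participant
sizes = [1000, 800, 1200]           # local dataset sizes
w = aggregation_weights(noise, server_noise_ratio=0.5, sizes=sizes)
models = [np.full(4, 1.0), np.full(4, 2.0), np.full(4, 3.0)]  # toy flattened models
print(w, aggregate(models, w))
```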
With rising living standards and the rapid growth of communication technologies, residential environments are becoming smart and connected, substantially increasing overall energy consumption. Since household appliances are the primary energy consumers, recognizing them is crucial to avoid unattended usage, thereby saving energy and making smart environments more sustainable. Traditionally, an appliance recognition model is trained at a central server (the service provider) by collecting electricity consumption data, recorded via smart plugs, from the clients (consumers), which constitutes a privacy breach. In addition, the data are susceptible to noisy labels when an appliance is connected to a non-designated smart plug. Addressing these issues jointly, we propose a novel federated learning approach to appliance recognition, FedAR+, which enables decentralized model training across clients in a privacy-preserving manner even with mislabeled training data. FedAR+ introduces an adaptive noise-handling method, essentially a joint loss function incorporating weights and label distributions, to make the appliance recognition model robust against noisy labels. By deploying smart plugs in an apartment complex, we collect a labeled dataset that, along with two existing datasets, is used to evaluate the performance of FedAR+. Experimental results show that our approach can effectively handle up to 30% noisy labels while outperforming prior solutions by a large margin in accuracy.
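A hypothetical sketch of an adaptive noise-handling loss in the spirit described above: the cross-entropy on the given (possibly wrong) label is blended with a term that trusts the model's own predicted distribution, and a per-sample weight down-weights samples the model finds implausible. FedAR+'s actual joint loss may be formulated differently; this is only an assumed illustration.

```python
import numpy as np

def adaptive_loss(probs: np.ndarray, given_label: int, beta: float = 0.7) -> float:
    """probs: model's predicted class distribution for one sample (sums to 1)."""
    eps = 1e-9
    ce_given = -np.log(probs[given_label] + eps)    # trust the provided label
    ce_self = -np.sum(probs * np.log(probs + eps))  # trust the predicted distribution
    weight = probs[given_label]                     # low confidence -> likely noisy label
    return float(weight * (beta * ce_given + (1 - beta) * ce_self))

clean = np.array([0.05, 0.90, 0.05])   # model agrees with label 1
noisy = np.array([0.85, 0.10, 0.05])   # model disagrees with label 1
ce_plain = -np.log(noisy[1] + 1e-9)
print(f"clean sample loss: {adaptive_loss(clean, given_label=1):.3f}")
print(f"noisy sample loss: {adaptive_loss(noisy, given_label=1):.3f} "
      f"(plain cross-entropy would be {ce_plain:.3f})")
```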
One of the key challenges in learning an online recommendation model is the temporal domain shift, which causes a mismatch between the training and testing data distributions and hence domain generalization errors. To overcome this, we propose to learn a future gradient generator that forecasts the gradient information of the future data distribution for training, so that the recommendation model can be trained as if we were able to look ahead into the future of its deployment. Compared with batch update, our theory suggests that the proposed algorithm achieves a smaller temporal domain generalization error, measured by a gradient variation term in a local regret. We demonstrate the empirical advantage through comparisons with a variety of representative baselines.
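A toy sketch of the "look ahead" idea: fit a simple forecaster on the recent history of gradients and apply the predicted next-step gradient. The paper's generator and theoretical guarantees are considerably richer; the linear extrapolation below is only an assumption for illustration.

```python
import numpy as np

def forecast_next_gradient(grad_history: np.ndarray) -> np.ndarray:
    """grad_history: (T, d) gradients from the last T periods; linear trend extrapolation."""
    T = grad_history.shape[0]
    t = np.arange(T)
    # fit an independent linear trend per coordinate and evaluate it at time T
    slope, intercept = np.polyfit(t, grad_history, deg=1)
    return slope * T + intercept

rng = np.random.default_rng(0)
d, T = 5, 8
drift = rng.normal(size=d)
history = np.stack([0.1 * k * drift + 0.01 * rng.normal(size=d) for k in range(T)])
theta = np.zeros(d)
theta -= 0.1 * forecast_next_gradient(history)  # update with the forecasted future gradient
print(theta)
```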